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ABSTRACT 


A new data assimilation system has been designed and implemented at the 
National Center for Aeronautie Meteorology and Climatology of the Italian Air Foree 
(CNMCA) in order to improve its numerieal weather predietion eapabilities and provide 
more aeeurate guidanee to operational foreeasters. The system, whieh is undergoing 
testing before eventual operational use, is based on an “observation spaee” version of the 
3D-Var method for the objeetive analysis component, and on the High Resolution 
Regional Model (H.R.M) of CNMCA for the prognostic component. New features of the 
system include completely rewritten correlation functions in spherical geometry, 
derivation of the objective analysis parameters from a statistical analysis of the 
innovation increments, introduction of an anisotropic component in the correlation 
functions, solution of analysis equations by a conjugate gradient descent method. The 
analysis and forecast fields derived from the assimilation system are subjectively and 
statistically evaluated through comparisons with parallel runs based on European Centre 
for Medium Range Weather Forecast (ECMWF): preliminary results of these studies are 
also presented. 
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I. INTRODUCTION 


Already in 1911 Bjerknes had clearly shown that the problem of weather 
forecasting could be thought of as an “initial condition problem” (Bjerknes, 1911). By 
this term we indicate the mathematical problem of predicting the future state of a physical 
system once the initial conditions of the system are known together with the equations 
governing its evolution in time. 

In order to solve it successfully Bjerknes had also proposed an operative 
procedure based on three main components: 

1. The Observing component; 

2. The Diagnostic component; 

3. The Prognostic component. 

A worldwide network of in-situ and satellite-based observing systems today 
composes the Observing component. The average daily numbers of the more common 
observations disseminated over the Global Telecommunications System to the main 
weather centers are summarized in Table 1.1 (ECMWF Global Data Monitoring Report, 
November 2001) 

This enormous quantity of data is composed of observations irregularly 
distributed in space and taken at different and often “asynoptic” times. The Diagnostic 
component of the forecasting system is responsible for producing an estimate of the 
“true” state of the atmosphere over a regular spatial grid at a given time. 

Starting from this well defined initial state, the “primitive” equations describing the 
behavior of the atmospheric system are marched forward in time in order to produce an 
estimate of the state of the atmosphere at some future time {Prognostic component). 

The main subject of the present thesis is the new diagnostic component of the 
CNMCA forecasting system: its design, implementation issues and preliminary results. 
The motivation for this work lies mainly in the deeply felt need to be able to take 

advantage of the increasing number of satellite derived observations which are nowadays 
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available and that cannot be easily accommodated in less recent objective analysis 
algorithms. 

Also a brief description of the limited area model (HRM) used in the Prognostic 
component of the system will be given, together with a discussion of the main types of 
observations currently used and the relevant observation error statistics. 


Observation type 

Number of Obs. 

Deseription 

SYNOP/SHIP 

53268 

Surface weather observations 
over land or on the sea 

BUOY/DRIFTER 

6300 

Observations from moored/ 
drifting buoys 

TEMP 

1162 

Upper air radiosonde 

observations 

TEMP/PILOT 

300 

Upper air wind observations 

AIRCRAFT 

37359 

Automatic and manual 

observations of Temperature and 
wind from aircraft 

SATOB 

230970 

Cloud motion winds from 
geostationary satellites imagery 

NOAA 15/16 ATOVS 

612488 

Polar satellite derived 

temperature and humidity profiles 


Table 1.1 Meteorological observations disseminated daily over the Global 
Telecommunications System (GTS) 


A. ATMOSPHERIC DATA ASSIMILATION 

According to a widely accepted definition (Daley, 1991) a modern data 
assimilation system is composed of four main components: 

1. Data quality control; 

2. Objective analysis; 

3. Initialization of the analyzed fields; 

4. Short range run of the prognostic model in order to produce an initial estimate of 
the atmospheric state for the successive analysis step {First guess or background 
fields). 
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A schematic representation of the 6-h intermittent data assimilation system of 
CNMCA is given in Fig. 1.1. A brief description of components 1., 2. and 4. follows, 
while discussion of the proposed new realization of the objective analysis component is 
deferred to Chapters 2-3. 

1. Data Quality Control 

The data quality control step is of paramount importance in order to prevent 
erroneous data from being fed to the objective analysis step with deleterious results to the 
performance of the system. Many notable cases of recent failures of numerical weather 
prediction (NWP) systems to correctly forecast high impact weather conditions (1999 
Christmas storm in Europe, January 2000 snow storm over the eastern coast of the US) 
can be attributed to the inaccuracies in the initial conditions. Often these problems can be 
traced back to the rejection of good observations, or the inclusion of faulty ones, by the 
data checking algorithms. 

At CNMCA the data quality control of the observations is performed in two 
distinct steps. The first one, called "Observation Pre-Processing" has the purpose of 
assigning a degree of confidence to each reported datum. This is done through a series of 
checks that include: 


1. Check against climatological gross limits; 

2. Internal consistency checks (for example between reported and recomputed 
heights in TEMP messages); 

3. Temporal and spatial consistency checks for observations from moving 
platforms (for example SHIP and DRIFTER messages). 


Observations whose confidence level is below the 70% mark are discarded. More 
details can be found in Norris (1990). 
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To reduce redundancy of information and prevent possible numerical problems in 
the subsequent solution of the analysis equations, observations whose relative distance is 
less than the average grid distance of the numerical model (~55Km) are averaged 
together and combined in a "Super-Observation" (Lorenc, 1981). 

In the second step of the data quality control, the final decision on whether to 
accept for ingestion the observations that survived the pre-processing step is taken. At the 
moment the decision is based solely on the normalized distance of the observations with 
respect to the background field. Although this method is statistically accurate, it can lead 
to rejection of good data in cases of rapidly evolving and poorly forecasted weather 
situations. Work is in progress on a buddy-check type of algorithm for estimating the 
accuracy of marginal observations. 


2. Initialization 

After the objective analysis step has blended in a statistically “optimum” way the 
information from the first guess fields and the new observations, a set of analyzed 
meteorological fields is produced which is suitable for a “synoptic” use (i.e. diagnosis of 
current state of the atmosphere for nowcasting and short range forecasting purposes), but 
not as initial conditions for the integration of a primitive equation model. The main 
reason for this lies in the fact that the imposed balance between the wind and mass 
observation increments are linear simplified conditions (approx, geostrophic, non- 
divergent), while the first guess fields implicitly satisfy the multivariate nonlinear 
conditions of the numerical model. As a result, the integration of non-initialized fields 
would cause the model to go through a geostrophic adjustment process with the 
excitation of inertia-gravity waves and the consequent degradation and noisiness of the 
forecast fields in the first 6-12 h. 

To avoid these undesirable effects, the “Adiabatic Implicit Normal Mode 

Initialization” technique is used. A detailed explanation can be found in Temperton 

(1988), but the main ideas can be summarized as follows. The analyzed fields are 

projected over the normal modes of a linearized version of the model equations. These 
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normal modes can be classified (at least for the extratropics) based on their respective 
frequencies or propagation velocities: “fast” modes, corresponding to inertia-gravity 
waves, “slow” modes, corresponding to meteorological Rossby waves. The result of the 
projection operation are two sets of ordinary differential equations whose integration in 
time gives the time evolution of the amplitudes of the normal modes. The imposition of 
appropriate conditions (so called Machenauer conditions) on the time tendencies of the 
amplitudes of the normal modes leads to an effective filtering of the high-frequency 
modes and removal of spurious numerical noise (see pp.377-385 in Haltiner & Williams, 
1980). 


3. Prognostic Model 

The numerical model used to produce the first guess fields in the intermittent data 
assimilation scheme is the High-Resolution Regional Model (HRM) of CNMCA. The 
HRM is a modified version of the Deutscher Wetterdienst EM/DM model (Majewski, 
2001), adapted to run on Compaq Alpha servers. The main numerical features of this 
hydrostatic primitive equation model are summarized below: 


1. Lat/Lon rotated coordinates grid, 0.5° resolution; 

2. C-type Arakawa grid, 2"‘* order centered finite difference scheme; 

3. Hybrid vertical coordinate, 31 model levels; 

4. Split semi-implicit time integration scheme; 

5. Integration of Helmholtz equation through Fast Fourier Transform and Gauss 
method; 

6. Davies formulation of boundary conditions; 

7. 4* order diffusive damping term. 
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As for the physics package of the model, its main characteristics are: 


1. Rytter and Geleyn (1992) radiative scheme; 

2. Stratiform precipitation scheme with clouds microphysics parameterization; 

3. Mass flux convective scheme (Tiedtke, 1989); 

4. Two-level vertical diffusion scheme (Mellor and Yamada, 1974), similarity 
theory at the surface (Louis, 1979); 

5. Two-level soil scheme; 

The operational area of model integration is shown in Fig. 1.2. Embedded in this 
model is another version of HRM (called Med-HRM) run at double horizontal resolution 
(0.25° effective grid spacing) over the Mediterranean basin (Fig. 1.3). Three hourly 
ECMWF forecast fields give the boundary conditions for the HRM model. 

The interface between the objective analysis step and the prognostic model is 
realized through three additional software packages: 

1. Insertion: spline interpolation of analyzed fields on pressure levels to hybrid 
coordinate model levels; 

2. IFS2HRM: interpolation and adaptation of ECMWF boundary fields to HRM grid 
and prognostic variables; 

3. Daily blending of CNMCA 12Z analysis fields with ECMWF 12Z analysis fields. 


For testing purposes, the assimilation cycle is run on a Compaq DS20E server, 
with a 3-h data window around the analysis nominal time. The run time for an average 
number of independent observations (-3500) is around 45 minutes, which, considering 
the time necessary for the post-processing elaborations, leads to the availability of the 
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analyzed fields three hours after the analysis nominal time. Twice daily (at OOZ and 12 
Z), an extended run (+48h) of the HRM model based on the assimilation cycle analysis is 
performed. 
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Figure 1.1 Data Assimilation Cycle at CNMCA. 
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II. MULTIVARIATE VARIATIONAL DATA ASSIMILATION 


Spatial objective analysis of meteorological fields can be traced back to the 
pioneering work of a group of research workers (J. Von Neumann, J. Charney, J. 
Smagorinsky and others) at Princeton Institute of Advanced Studies in the late 1940s (see 
Daley, 1991, for more details and a historical perspective). The early techniques were 
based on the concept of function fitting: the meteorological field to be analyzed was 
expanded in a finite series of known functions. The coefficients of the expansion were to 
be determined by a least square minimization of some function of the distance between 
the fitting function and the observation. The procedure leads to the solution of a linear 
system or, equivalently, the inversion of a square matrix (Gram matrix), which, in the 
case of global fitting, can become extremely large and expensive to compute. Due to this 
and other technical problems (possible ill-conditioning of the Gram matrix, underfitting 
or overfitting due to the non-stationariety of the observing system) the function fitting 
technique was quickly superseded in operational practice by the method of successive 
corrections (SCM) (Bergthorsson and Doos, 1955). In this technique, the function fitting 
is local instead of global (i.e., only the observations within a predetermined radius of the 
analyzed grid point influence the analysis) and the weights are specified a-priori as 
monotonically decreasing functions of the distance between observation and analysis grid 
point. Besides introducing a very efficient and robust algorithm, Bergthorsson and Doos 
also brought in a number of other ideas which are still in wide use in operational 
objective analysis schemes: the use of a forecast first guess (or background) field and the 
computation of the a-priori weights through a statistical analysis of the objective analysis 
errors. For all these attractive properties, the SCM has some important limitations, at least 
in its simpler formulations. It gives too much weight to the observations and it ignores the 
cross-correlations between observations. 

Although most of the problems can be overcome with more sophisticated, iterated 
version of the algorithm (Bratseth, 1986), in operational practice the SCM algorithm was 
superseded in the 1980s by a direct method of solution of the analysis equations known 

as “Statistical Interpolation ” or “Optimal Interpolation ” (01). The roots of the method 
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go back to the work of Kolmogorov (1941) and Wiener (1949) and it is interesting to note 
that 01, as other recent teehniques, found its early applieations not in meteorology but in 
other fields of the Earth’s sciences and also in engineering. The seminal work that 
brought the teehnique to the attention of the meteorologieal community was that of 
Gandin (1963), who gave a theoretical derivation of the algorithm and discussed the main 
issues of its application in an operational environment. 

The 01 method derives its name from the faet that it tries to minimize (optimality 
principle) the expected varianee of the analyzed fields through the a-priori knowledge of 
the error charaeteristics of both the observations and the background fields. In operational 
practice a number of assumptions are made whieh make the method sub-optimal. Apart 
from fundamental issues sueh as the imperfect knowledge of the observations and 
baekground fields error statistics, which are also often assumed stationary, homogeneous, 
isotropie and separable into the horizontal (or constant pressure) and the vertieal 
eomponents, the main approximations eommonly used are: 

1. Loeal approximation: only a limited number of observations enter into the solution 

of the analysis equations for any given grid point (or analysis volume, Lorene, 

(1981)); 

2. Only observations that are linearly related to the prognostic variables of the model 

are eommonly used, although extensions of the method to handle non-linear 

observations are possible (Ledvina and Pfaendtner, 1995). 

In the 1990s, due to progress in computer processing speed and data storage 
eapabilities, the variational approaeh to the objective analysis problem has gained 
popularity as an elegant and powerful mathematical formalism capable of overcoming 
some of the problems mentioned above (Parrish and Berber, 1992; Rabier and Courtier, 
1992; Cohn et ah, 1998; Daley and Barker, 2000). The methods developed by these 
authors are known under the general name of 3D-Var algorithms and, apart from 
implementation details, are basieally equivalent. A scalar measure (usually ealled cost 
function) of the distanee between the observations and baekground fields to the analysis 
fields is minimized with respeet to the unknown analysis values, thus producing the 
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maximum likelihood estimate of the atmospheric state. In the case of quadratic cost 
functions, the minimization procedure leads to a set of linear analysis equations that are a 
generalization of the 01 analysis equations. This is not a mere coincidence since, for 
stochastic variables with a Gaussian error distribution (as many weather fields can be 
thought of having), it can be shown (Daley, 1991) that the minimum variance estimate 
and the maximum likelihood estimate are coincident. 

In the 3D-Var methods a basic assumption of 01 is retained, namely that of the 
stationary character of the background error statistics. Relaxation of this hypothesis leads 
to various modem analysis techniques, commonly known as Four Dimensional Data 
Assimilation Methods (FDDA), which can be roughly divided in two main categories 
(Menard, 1994): time extensions of variational methods, known as 4D-Var, where the fit 
to the data is performed over an extended period of time (instead of intermittently) using 
the model as a strong constraint (i.e., strict consistency with the model, assumed perfect, 
is required); methods derived from estimation theory (Kalman-Bucy filter and its various 
modifications and extensions). In these latter methods no perfect model assumption is 
required; however, common to both kind of algorithms are the great computational and 
storage requirements that, at present, make them unfeasible options except for the largest 
weather centers. 

In the present work we concentrate on the version of the 3D-Var algorithm which 
forms the basis of the new objective analysis system at CNMCA. To the author’s 
knowledge, it was first implemented operationally at the Data Assimilation Office, 
NASA Goddard (Cohn et al, 1998), and more recently at Naval Research Laboratory, 
Monterey (Daley and Barker, 2000). The basic theory will be reviewed, then a thorough 
description of the background error covariance models used in the CNMCA 
implementation of the algorithm will be given. Finally the topic of a possible anisotropic, 
flow-dependent extension to the standard covariance modeling will be addressed. 


A. THEORY OF 3D-VAR 

The following is only a short account of the main ideas of 3D-Var which are 

relevant to the present work: the interested reader can find more details and a wider 
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theoretical background in the works of Lorenc, 1986; Daley, 1997; Cohn, 1997; Cohn et 
ah, 1998; Daley and Barker, 2000. 

Let j be the column vector containing all the p (~10"^) observations at the analysis 
time; jc, the column vector of the true values of the n (~10^) state (prognostic) variables at 
the same time at the model grid points; Xa and Xb the analogous quantities for the analyzed 
and background fields; then the forecast error vector is: 




( 2 . 1 ) 


while the observation error vector: 


eo=y-H(X() ( 2 . 2 ) 

where H is the observation (or forward) operator, i.e. the operator that performs the 
transformation from the state variables on grid points to the observed variables at the 
observing locations (in the case of linearly related variables it reduces to an interpolation 
operator). We make the usual assumptions that these error vectors, e* and Co have normal 
distribution functions with zero mean (i.e.: no bias) and are mutually uncorrelated. 

From these we can now define the forecast error covariance matrix: 

Pb= < ebej> (2.3) 

and the observation error covariance matrix: 

R = <eoeo^> (2.4) 

By definition, both matrices are symmetric, positive definite, with dimensions nxn 
and pxp respectively. 
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Given the statististieal assumptions made on Cb and Cg, it can be shown (Daley, 
1991, Chap.2) that the maximum likelihood estimate of the state of the atmospheric 
system is the one the minimizes the following cost function; 

J = 0.5[y - H(xa)fR'‘[y - H(xa)] + 0.5[xb - XafPb‘[xh - xj (2.5) 

2 

i.e., minimize a scalar distance in L of the analysis fields from both the observations and 
the first guess fields based on their respective perceived accuracies. 

To find the minimum of the cost function (2.5), we differentiate it with respect to 
Xa, obtaining the gradient of J: 

VJ = - HV[y - H(xa)] + Pb'xb (2.6) 

where we made use of the property of bilinear symmetric forms dldx(x^Ax) = (A + A^)x 
= 2Ax, and introduced the Jacobian matrix of the observation operator H: 


H = d/dx(H) (2.7) 

A necessary condition for a minimum is obtained by setting VJ= 0: 


- H^R ‘[y - H(xa)] + Pb^Xb + H^R^H[Xa - xJ - H^R ‘H[Xa -xJ = 0 (2.8) 

T -1 

where the vector H R~ H[Xa - XbJ has been added and subtracted. Assuming the analysis 
fields to be only first order corrections to the first guess fields {^"Tangent Linear” 
approximation) we have: 


H(Xa) = H(Xb - (Xa - Xb)) =H(Xb) - H(Xa - Xb) 
which, when substituted into (2.8), yields; 


(2.9) 
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- H^R^[y - H(xh)] + (Pb'+ hVh) (xa -Xb) = 0 


(2.10) 


or, 


Xa-Xb = (Pb'+ hVh)^ H^R '[y-H(xb)] ( 2 . 11 ) 

This form of the analysis equations involves eovarianee matriees eomputed in the 
grid spaee of the model (or, equivalently, in its speetral spaee), whieh, in the present ease, 
would involve matriees of order -10^x10^ (although, in speetral space and under the 
assumptions of homogeneity and horizontal isotropy for correlations, they can be shown 
to be diagonal or block diagonal: Courtier, 1997; Courtier at al, 1998). 

A less expensive but equivalent form of the analysis equations can be derived 
making use of the Sherman-Morrison-Woodbury formula (for the details of the derivation 
see Courtier, 1997), or, more simply by noting that: 

H^R ‘ (Pb^+ H^R ‘H) ‘ = (HPb H^+ R) Pb (2.12) 

and that all the matrices considered and their inverse are positive definite. Inserting 2.12 
into 2.11 we find: 

Xa-Xb= Pb (HPb H^+ R)-‘[y - H(xb)J (2.13) 

This formulation is the one used in the present work. To the author’s knowledge, 
it was first implemented in the NASA/Goddard Data Assimilation Office “Physical Space 
Statistical Analysis System” (Cohn et ah, 1998) and, more recently, in the NAVDAS 
assimilation system of NRL Monterey (Daley and Barker, 2000). 

From inspection it is clear that we are dealing with an observation space 
algorithm: R is defined in observation space while Pb is projected into it by means of the 
observation operator H. Taking into account the average number of observations 
currently entering into the CNMCA assimilation system (~10"^), we can see that this 
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approach is able to considerably reduce the computational and storage load on the 
computing facilities without placing any constraint on the eovariance function models. 

The main differences with respect to standard 01 algorithms are twofold; first 
there is the explicit use of the observation operator H, which will provide a natural way to 
extend the objective analysis to include observations related in a nonlinear way to the 
prognostie variables of the model (see below); seeondly, as it will be shown below, the 
solution to (2.12) is computed globally (i.e.: making use of all available observations for 
every grid point), without the need for data selection procedures. 


The extension of (2.12) to non-linear observations (sueh as radiances, wind 
speeds, total column water content, ete., where H depends on the model state x) is 
conceptually simple: we define a series of system states: 


Xq ^2y * • •y^i-h^b^i+ly • ♦ • 


(2.14) 


where the starting state eoincides with the first guess fields. 

Then, defining the Jacobian of the observation operator around the i state of the 

system 


Hi = d/dx(H)^^i (2.15) 

the nonlinear version of the analysis equations is found by looking for a Newtonian 
iterative solution to the problem of finding the root of the Jaeobian of the cost function 
( VJ=0; eq. 2.8) : starting from a first guess state Xo=Xb, the i+1 state is given by: 

Xi+i = Xi-S^J(XiX^ ■ VJ(xi) (2.16) 

where \^J(Xi)~^ = (Pt^ + H^iR'^ Hi) is the Hessian matrix of the cost function J and it can 
be shown to be the inverse of the analysis error covarianee matrix. 
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After some manipulation, 2.16 can be recast in the computationally more efficient 


form: 


Xi+, = xo + Pb H^i (Hi Ph H^i + R)-'[y - H(xi) + H(Xi - xo)J (2.17) 

The iterative procedure stops when the difference between Xi+i and Xi becomes 
smaller than a predefined value. Unlike the linearized version of the analysis equations, 
which result from a quadratic cost function, the minimization procedure (2.15) cannot be 
guaranteed to converge and it can have more the one minimum. Up to now only 
observations linearly related to the prognostic variables of the model have been used in 
the experimental phase. 


B. IMPLEMENTATION ISSUES 

The practical implementation of objective analysis equations (2.13) has been 
carried out as follows. The set of equations (2.13) can be solved in two steps. First 
solution of the linear pxp system: 

(HPbH^+R)z=y-H(xh) (2.18) 

in the unknown vector z. Secondly, projection of the solution on grid space via: 

Xa-Xb=PhH^z (2.19) 

The second step amounts to perform a matrix-vector product between the pxn 
matrix Pb H and the p vector z, which can be computed efficiently in Fortran90 code. 

The first step involves the solution of a large, sparse, symmetric and definite 
positive linear system. It can be shown (Golub and van Loan, 1996) that this step is 
mathematically equivalent to finding the minimum of the following cost function: 

F(z) = l/2z^(HPb H^+ R)z -z'^(y- H(xb)) (2.20) 

which can be done using a standard Conjugate Gradient (CG) Algorithm (Subroutine 
FI IGBZ, NAG Library, Mark 17,1995, has been used). 
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In numerical experiments, it has been found convenient to implement a scaled 
version of 2.13. This is due to the reason that, in this way, the condition number of the 
matrix (HPbH^ + R) (defined as the absolute value of the ratio of the maximum to the 
minimum eigenvalue) shows a considerable decrease, thus speeding up the convergence 
of the descent algorithm. 

The scaled version of (2.13) is found following Daley and Barker (2000). If we 
redefine the observation operator H as the product of a spatial interpolation operator /T* 
and a real observation operator H (i.e. H* H) and set: 


Pb h/= and, H* Pb hJ= 

(2.21) 

p^gr/objjT^ fj^gr/ob 

(2.22) 

jjp ob/ob jjT objl/2 ^ ob/ob objl/2jjT 

(2.23) 

Sh= diag(H* Pb’'^°'' H/) 

(2.24) 


Where Sb=diag{Pb), (i.e. the background variances of the analysis variables), 
Sb'’=diag{Pb^), (i.e. the observation variances of the analysis variables), and 
are the corresponding correlation matrices., then the scaled form of 2.13 is: 

Xa-Xb= Sb'^' fSb^Y'H"^ - H(xb)] 

(2.25) 

and = Sh^^^H[Sb''f^ Cb''^°’'[SbY^H^Sh^^\ 

The main point here is the rescaling of the (H Ph + R) covariance matrix in the 
non-dimensional form [Ch^^°^+ SY^R SY^]. 

In experimental runs, without preconditioning, for an average number of 
observations (p~3500), the cost function (2.20) converges to within machine precision in 
around 150 steps (average time ~90 seconds on Compaq DS20E server). Since the 
computational cost of the algorithm scales as number of CG iterations multiplied by , it 
is important, in view of ingesting asynoptic type of observations, to implement an 
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effective pre-conditioner for the (H Ph + ify) matrix. Work in this direction is still in 
progress. 


C. BACKGROUND ERROR COVARIANCES 

"'The most important element in the statistical interpolation algorithm is the 
background error covariance matrix. To a large extent, the form of this matrix governs 
the resulting objective analysis... ” (Daley, 1991). Much effort has so far gone into the 
redefinition of the background error covariance model. The derivation follows the main 
ideas of Bergman (1979), Thiebaux et al. (1986), Thiebaux at al. (1990), Tillmann 
(1999). An original contribution has been the explicit formulation of the temperature- 
wind cross correlations in spherical coordinates, using the thermal wind relationship as a 
constraint. This, to the author’s knowledge, is not to be found in the relevant literature. 

The need to express the auto and cross-correlations in spherical coordinates 
instead of using a local plane projection arises from the fact that, for the correlation 
model used, one has non negligible correlation values at distances comparable to the 
Earth’s radius. 

Taking into account the narrower structure shown by the background vertical 
correlations of temperature with respect to those of geopotential and the weaker vertical 
correlation of radiosondes’ temperature measurements with respect to geopotential 
observations, the choice was made to adopt the following analysis variables: temperature 
(T), zonal (u) and meridional (v) components of wind, surface pressure (SP) and relative 
humidity (RH). 


1. Upper Air Analysis Covariance Model 

The upper air analysis is multivariate in (T, u, v). The covariance model has been 
derived as follows. 


Starting from the geostrophic constraint: 
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u = -(k/sin(p) d^ldcp 
V = k/(sin(pcos(p) 


(2.26) 

(2.27) 


where ((p, X) are the latitude-longitude eoordinates, 0 is the geopotential, and k=n/2Q is a 
eonstant that takes into account the geostrophic coupling parameter ji and the orbital 
angular velocity of the Earth Q. Making use of the equation of state for dry air (p=pRdT) 
and the hydrostatic equation {dpld0=-p), (2.25-26) can be recast as a form of the thermal 
wind constraint: 

du/dp = kRc/(p sincp) dTIdcp (2.28) 

dv/dp = -k/(psin(pcos(p) dTIdX (2.29) 

Under the usual hypothesis of homogeneity, isotropy and separability for the 
temperature autocorrelation function, we assume the following functional form for the 
temperature covariance: 


Cov(n Tj) = ar' R(x) x^'^iPuPj) (2.30) 

where ar is the background error temperature variance (which can vary both in latitude 
and in pressure level), R(x) is the quasi-horizontal (isobaric) component of the correlation 
model (function only of the Great Circle distance x = cos~^(sincpisincpi + coscpicoscpj 
cos /zl/l/)of points ij) and/^^ is the vertical part of the correlation function. Following 
Thiebaux et al. (1986) and Daley and Barker (2000), the functional representation for 
R(x) has been chosen as a Second Order Autoregressive (SOAR) Function of the form: 

R(x) = (1 +c x)exp(-c x) (2.31) 

Where the length scale c‘ will be specified through a statististical analysis of the 
observed minus forecast increments, as will be shown in Chapter 3. 
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The derivation that follows is, however, independent of the speeific form of either R(t) or 
Making use of (2.27-28) is easily seen that: 


d/dlnpi Cov(Ui,Tj) = gt kRd/sin(pidR(x)/dx dz/d(piX^^(Pi,Pj) (2.32) 

d/dlnpj Cov(Ti,Uj) = gt kRd/sincpj dR(x)/dx dx/d(pj x^^(Pi,Pj) (2.33) 

d/dlnpi Cov(Vi,Tj) = - gt kRd/(sin(piCos(pi) dR(x)/dx dx/dXix^^ (puPj) (2.34) 

d/dlnpj Cov(Ti,Vj) = - gj kRci/(sin(pjCos(pj) dR(x)/dx dx/dXj x^^ (Pi,Pj) (2.35) 

^/(dlnpidlnpj) Cov(Ui,Uj) = (GjkRdf/(sirKpiSincpi) (d^R(x)/dx^dx/dcpidx/dcpj + 

dR(x)/dx ^x/d(pid(pj) x^^(Pi’Pi) (2.36) 

^/(dlnpidlnpj) Cov(Vi,Vj) = (gt kRdf/(sin(piCos(piSin(piCos(pj) (d^R(x)/dx^dx/dkidx/dkj 
+ dR(x)/dx ^x/dkidkj) x^^(Pi,Pj) (2.37) 

^/(dlnpidlnpj) Cov(Ui,Vj) = - (gt kRdf/(sin(piSin(piCosfj) (d^R(x)/dx^ dx/d(pidx/dkj + 
dR(x)/dx ^ x/dcpidkj) x^^(Pi>Pj) (2.38) 

^/(dlnpidlnpj) Cov(Vi,Uj) = - (gt kRdf/(sin(piSin(piCos(pj) (d^R(x)/dx^ dx/d(pjdx/dki + 


dR(x)/dx ^x/d(pjdki) x^^(Pi’Pj) (2.39) 

Similarly to (2.29) the other covarianees ean be written as: 

Cov(Ui,Tj) = Gu GTR^^(ri,rj)x"^(Pi,Pj) (2.40) 

Cov(Vi, Tj) = Gy GrR^^i^rj) x'’^(PiPj) (2.41) 

Cov(Ui,Uj) = Gu R""(ri,rj) x““(Pi,Pj) (2.42) 

Cov(Vi,Vj) = Gy Rk^(ri,rj) x'’'’(Pi,Pj) (2.43) 

Cov(Ui,Vj) = Gu Gy R“''(nrj) x“'’(Pi,Pj) (2.44) 

and substituted into (2.25-32), thus obtaining for the vertical components: 

x““(PiPj) = X^(PiPj) = x“'’(Pi’Pj) = x'’“(PiPj) (2.45) 
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x“^(Pi>Pj) =X^(PiPj) = d/dlnpjx‘“(Pi,Pj) (2.46) 

X^“(Pi’Pj) =X^'’(PiPj) = d/dlnpix"‘'(Pi,Pj) (2.47) 

X^^(Pi’Pj) = ^/(^Inpidlnpj) x““(Pi,Pj) (2.48) 

while for the isobarie eomponents we find: 

cr„ R“^(ri,rj) = ot kRd/sin(pi dR(x)/dx dx/d(pi (2.49) 

Ov R^^(ri,rj) = - Ot kRd/(sin(pi cos(pi) dR(x)/dx dx/dXi (2.50) 

Ou R"“(ri,rj) = (aTkRdf/(sin(piSin(pi) (d^R(x)/dx^dx/d(pidx/d(pj + 

dR(x)/dx^x/d(pid(pj) (2.51) 

au cTv R“'’(ri,rj) = - (aTkRdf/(sin(pi sincpi cos(pj) (d^R(x)/dx^dx/dcpidx/dkj + 

dR(x)/dx^ x/dcpidkj) (2.52) 

Ov R''''(ri,rj) = (oTkRdf/(sin(piCos(piSin(piCos(pj) (d^R(x)/dx^dx/dlidx/dlj + 

dR(x)/dx^ x/dXi^j) (2.53) 

Setting 

Lim(t^o) dR(x)/dx =L (2.54) 

we obtain for the geostrophically constrained wind variances: 

2 2 2 
Lim(t^o) Cov(Ui,Uj) = Ou = Lim(t^o) Cov(Vi,Vj) = = -(ajkRd/sifKpi) L (2.55) 

Substituting (2.54) into (2.48-52) we completely determine the isobarie correlations: 

R“^(ri,rj) = dR(x)/dx dx/dtpi (2.56) 

RX^(ri,rj) = - (cos(pi)'^ dR(x)/dx dx/dXi (2.57) 

R'^'^(ri,rj) = (d^R(x)/dx^dx/d(pidx/d(pj + dR(x)/dx^x/d(pid(pj) (-L)~^ (2.58) 
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R“^(ri,rj) = - (d^R(x)/dx^dz/dfidr/dXj + dR(x)/dx^x/d(pidXj) (-L cos<pj)~^ (2.59) 
R^''(ri,rj) = (d^R(x)/dx^dx/dXidx/dXj + dR(x)/dx^x/dXidXj) (- L coscpi cos(pj)~^ (2.60) 


The above formulas are independent of the chosen correlation model. In the 
present case, assuming (2.29) as the functional representation for the isobaric temperature 
autocorrelation model, we obtain: 


R“^(ri,rj) = c(x/sinx)exp(-cx)( cosfi sinfj- cosfj sinfi cos I AX I) (2.61) 

R^^(nrj)=-R^%,rj) (2.62) 

RX^(ri,rj) = c(x/sinx)exp(-cx)(cos(pj sin(Xi-Xj)) (2.63) 

R^^(nrj)=-R^^(ri,rj) (2.64) 

R^'^(ri,rj) = - ((sinx(l-cx)- xcosx)/(sin x) (coscpi sincpj- coscpj sin<pi cos I AX j) 

(coscpj sincpi- coscpi sincpjcos IaxI) - (x/sinx) (cosfi sincpj- 

cosfj sin(pi cos /AX /)) exp(-cx) (2.65) 

RX^(ri,rj) = ((sinx(l-cx)- xcosx)/(sin x) (coscpi coscpjsin j AX j) + 

(x/sinx) cos IaxI)) exp(-cx) (2.66) 

R“^(ri,rj) = ((sinx(l-cx)- xcosx)/(sin\) (coscpi sincpj-cosfj sin(pi cos IaxI) 

(cos(pi sin(Xi-Xj) + (x/sinx) (sincpi sin(Xi-Xj)) exp(-cx) (2.67) 


The above isobaric correlations are shown in Fig.2.1 through 2.6. 
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Figure 2.1 T-T correlation function (SOAR Model, c = 0.1 rad). 


U-T LornUillBn RlteUeti 



Figure 2.2 U-T correlation function (SOAR Model, c = 0.1 rad). 
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Figure 2.3 V-T correlation function (SOAR Model, c = 0.1 rad). 
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Figure 2.4 U-U correlation function (SOAR Model, c = 0.1 rad). 
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V-V ComlaliBn P'undMin 



Figure 2.5 V-V correlation function (SOAR Model, c = 0.1 rad). 
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Figure 2.6 U-V correlation function (SOAR Model, c = 0.1 rad). 
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The vertical part of the correlation function has been modeled after Bergman 


(1979): 


x““(Pi’Pj) = (1 +kplog^ (p/Pj)/^ ( 2 . 68 ) 

which, for the other correlations, gives: 

X^(Pi’Pj) =x“''(Pi’Pj) = x'’“(Pi’Pj) = (^+kplog^(pi/pj))~^ (2.69) 

x“^(PiPj) =x'’^(Pi’Pj) = 2kp^Hog(pi,pj) (l+kplog^(pi/pj)f (2.70) 

X^yPiPj) =X^'’(Pi’Pj) = -2kp^^log(pi,pj) (l+kplog^(pi/pj))~^ (2.71) 

X^^(PiPj) =(1- 4kplog (pi,pj) x'“(Pi,Pj)) (l+kplog^(pi/pj)X^ (2.72) 


Sketches of the functions are given below for kp=5, in Fig. 2.7 through 2.10. 

The vertical correlation parameter kp will be determined through a statististical 
analysis of the observed minus forecast increments, as will be shown in Chapter 3. 
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Figure 2.7 U-U Vertical Correlation function {kp = 5) 



U vIkhI >Cair«laLKKi u—T 

Figure 2.8 U-T Vertical Correlation function {kp = 5) 
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Figure 2.9 T-U Vertical Correlation function {kp = 5) 



Figure 2.10 T-T Vertical Correlation function {kp = 5) 
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It is also useful to derive the mutual eorrelations of temperature and wind field 
with the geopotential. In order to do this, elimination of u,v from (2.26-2.29) leads to: 

dT/dq) = -l/(Rd) ^^ld(pdln(p) (2.73) 

drm=-i/(Rd) ^mMn(p) (2.74) 

From these relations, assuming that the geopotential autocovariance can modeled 
as: 


Cov(^, 4>j) = R(x) (Pi,Pj) (2.75) 

we can derive the correlations: 

d/d(p i Cov(Ti, 0j) = -a^R/^ dR(x)/dx dx/d(pid/dln(pi) (Pi>Pj) (2.76) 

d/d?iiCov( (Pi,Tj) = -ajRd^ dR(x)/dx dx/d?ijd/dln(pj)y^ (Pi,Pj) (2.77) 

^/dXidcp i Cov(Ti, Tj) = -a^Rd^ dR(x)/dx ^x/dXidcpi ^/dln(pi)dln(pj)y^ (P^Pj) 

(2.78) 

Separating the vertical and isobaric components of the correlations, we find: 


X^%iPj) = d/dlnpiy^ (pupj) = y^“(pi,pj) (2.79) 

X^^(PiPj) = d/dlnpjy^ (pi,pj) = x‘‘^(PiPj) (2.80) 

X^^(Pi,Pj) = ^/(dlnpidlnpj) y**(pi,pj) =y“"(pi,pj) (2.81) 

while the isobaric components are given by: 

aT(7^R^^(ri,rj) = -aj Rd^ dR(x)/dx dx/dXj (2.82) 

gto^R^^ (ri,rj) = -a^ Rd^ dR(x)/dx dx/dxpi (2.83) 

gt R^^(ri,rj) = Rd^^x/dXid(pidR(x)/dx (2.84) 
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From these expressions we are also able to derive the geostrophieally constrained 
autocovariances: 


Lim(r^o) Cov(Ti,Tj) = oj = Rd'^Lim(^^o) Cov(<Pi,4>j) = Rd^aj (2.85) 

In analogous fashion, starting from (2.25-2.26) the cross correlations between the 
wind components u,v and the geopotential can be derived. The results of the 
computations are: 


(Pi,Pi) = X^(Pi.Pj) = x'"(Pi,Pj) (2.86) 

X%i.Pj) = X‘^ (Pi,Pj) = X^(Pi,Pj) = x'^yPi^Pj) (2.87) 

while for the isobaric cross correlations we find: 

au(74>R‘‘^(ri,rj) = k/sincpi dR(x)/dx dx/dxpi (2.88) 

a 40 uR^(fi,fj) =-aj k/sin(pj dR(x)/dx dx/dxpj (2.89) 

ava 4 ,R'’^(ri,rj) = aj k/(sin(pi coscpi) dR(x)/dx dx/dXi (2.90) 

R^(fi,rj) k/(sin(pj cos(pj) dR(x)/dx dx/dXj (2.91) 


2. Surface Analysis Covariance Model 

The surface analysis covariances used in the objective analyses of the surface 
pressure (SP), mean sea level pressure (MSLP) and 10-meter wind fields are a simplified 
version of the models used in the upper air analysis. 

In this case only the geostrophic constraints (2.26-2.27) apply. The computations 
are similar to the ones described above and lead to analogous results, with the notable 
exception that, assuming that the pressure autocovariance is modeled as: 

Cov(pi,pj) = ap^ R(x) (2.92) 

Then, the geostrophieally constrained wind variances are given by: 
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2 2 2 
Lim(t^o) Cov(Ui,Uj) = Ou = Lim(r^o) Cov(Vi,Vj) = = -(apk/psin(pi) L (2.93) 


where L is given by (2.47). For the SOAR model of (2.24) the wind component 
autocovariances reduce to: 

aj = = (opkc/psincpif (2.94) 

It is clear that this correlation model is really applicable only under the provision 
that the geostrophic components of the observed surface winds have been extracted. This 
requires the use of an appropriate boundary layer model. Work in this direction is under 
way, making use of the Planetary Boundary Layer model developed by R.A. Brown and 
coworkers (Brown and Levy, 1986). 


3. Anisotropy and Flow Dependency of Covariance Functions 

An implicit assumption of 3D-Var algorithms (and, in general, of intermittent 
assimilation schemes) is the stationary character of the background error covariances. 
This simplification, together with those of isotropy and homogeneity, is widely used in 
operational settings due to the reason that taking into account the temporal evolution of 
the background covariances would greatly increase the complexity of the objective 
analysis algorithm and the burden placed on the computing resources. However, it is very 
well known (Daley, 1991; Otte et ah, 2001), that the observed-minus-background 
correlation patterns of weather systems show considerable anisotropy (the SW-NE tilt of 
upper level trough axis, for instance), which make the isotropy and stationarity 
approximations serious shortcomings of data assimilation systems which do not take the 
time evolution of the covariance matrices into account. 

The problem has been tackled in a variety of ways. In the context of variational 
assimilation there are two main approaches: 
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1. the 4D-Var scheme, where the background covariance matrix is evolved 
implicitly by the dynamics of the tangent linear model (and its adjoint; see 
Rabier and Courtier (1992) for details); 

2. the Kalman-Bucy filter, wich explicitly evolves the background covariance 
matrix, and the many approximations which have been proposed in order to 
make this method computationally tractable (“Ensemble Kalman filter”, 
Evensen,1994; “Reduced-Rank Kalman filter”, Eisher, 1998; “The Cycling 
Representer algorithm”, Xu and Daley,2000); 

Only the 4D-Var approach has been implemented in operational environments 
even though many approximations are used in practice (ECMWE; Rabier et ah, 2000). 
The Kalman filter method is under active study, but it does not seem feasible to be 
implemented in an operational setting with the current generation of computers. This is 
due to the fact that the method requires an explicit computation of the analysis error 
covariance matrix and its evolution in time and this matrix, for current numerical weather 
prediction models, is of the order of 10^x10^. 

Due to the unavailability at CNMCA of the human and computer resources necessary 
for tackling the problem of the temporal extension of 3D-Var, possible ways to 
circumvent the main shortcomings of the algorithm were investigated. The approach we 
have chosen was pioneered by Benjamin (1989) and it has recently been actively 
investigated by many researchers (Miller and Benjamin, 1992; Devenyi and Schlatter, 
1994; Riishojgaard, 1998; Otte et ah, 2000). 

In this method the isotropic, stationary covariance model described in the preceding 
paragraphs, is modified through multiplication by an anisotropic, flow-dependent term 
that is a function of the background field potential temperature. Eor example, the 
temperature autocovariance function (2.29) is modified as follows: 


Cov(n Tj) = ar' R(x) /'^(Pi,P}) y(l I ^(npi) - ^(r.pj) U) (2.95) 

In the present implementation the anisotropic component v is modeled by a 
simple SOAR function of the absolute difference of the background potential 
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temperatures at loeations ij. This is a eomputationally eheap way of obtaining flow- 
dependent eovarianees, and it relies heavily on the aeeuraey of the first guess potential 
temperature field in order to produee positive results. However, it might be argued that 
also in the Kalman filter approaeh the time evolution of the analysis error eovarianee 
matrix depends on the accuracy of the previous analysis step and of the model itself 
(actually, a linearized version of the model). We note in passing that the 3-D character of 
this formulation should also be helpful in the objective analysis of single level 
observations in the mass-wind analysis (SYNOP, SHIP, BUOY, AIREP, etc), where the 
use of the statistically derived vertical correlation functions (2.67-2.72) can be 
detrimental when it does not take into account the presence of sharp air-mass boundaries 
(i.e. boundary layers with sharp inversions, tropopause boundary). 

An example of how the flow-dependent term impacts the objective analysis is 
now discussed. The main feature of the synoptic state of the atmosphere over the model 
domain on the 19* of June 2002, OOUTC, is the elongated upper level trough whose 
cyclonic part extends over northwestern Spain, the British Isles and Scandinavia (Fig. 
2.11). In the lower troposphere this is mirrored by a sharp air mass boundary extending 
from Spain trough France, Germany and the southern portion of the Scandinavian 
Peninsula (Fig. 2.12). In this assimilation cycle a radiosonde near Paris reported a 500 
hPa temperature 1.5°C higher than the forecast temperature. The way in which this 
observation increment is spatially interpolated in the ensuing objective analysis can be 
seen in Fig. 2.13. The air mass boundary present at 500 hPa (Fig. 2.14) clearly models 
the shape of the correlation function in an anisotropic and flow-dependent way. 

Although these results look promising and intuitively appealing, the merits of the 
method must be evaluated in an objective way, through comparisons of statistical skill 
scores of the forecast fields derived from objective analysis performed both with and 
without the flow-dependent term. This work is in progress. 
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Figure 2.11 CNMCA 500 hPa Geopotential height and Temperature Analysis: 
June 19*’’ 2002, OOUTC. 
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Figure 2.12 CNMCA 850 hPa Wet Bulb Potential Temperature Analysis: June 


19*2002, OOUTC. 
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ROME Analysis: Mercoledi 19 Giugno 2002 


T SOOhPa Analysis: increment over First Guess 



Figure 2.13 CNMCA 500 hPa Temperature Analysis Increment over First 
Guess: June 19* 2002, OOUTC. 
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Figure 2.14 CNMCA 500 hPa Potential Temperature Analysis: June 19* 2002, 
OOUTC. 
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III. BACKGROUND AND OBSERVATION ERRORS 

STATISTICS 


In order to maximize the amount of information extracted from the observations 
and, at the same time, reject spurious noise (i.e., information related to spatial scales 
which the numerical model and, consequently, the objective analysis algorithm, cannot 
resolve), it is vital that the background (P^ ) and the observation (R) error covariance 
matrices be specified as accurately as possible. 

Recalling their definitions; 

Pb=<ebeJ> (3.1) 

R=<eoeo^> (3.2) 

and the fact that both errors Cb and Co refer to unknown, unbiased, “true” values, it is clear 
that we cannot compute these quantities from first principles. What is done in practice is 
to estimate these covariances by a statistical procedure: we are assuming that these errors 
are stationary in time and uniform over our spatial domain in order to derive their 
“climatology”. From this and the underlying assumptions that the errors are normally 
distributed and unbiased, we are then able to compute the second moments (variances and 
covariances) of their probability density functions (pdf). 

In practice this calculation can be done in at least three different ways: 

1. The ^"Observation method" (Rutherford, 1972; Hollingsworth and Lonnberg, 
1986). In this procedure observation-minus-first guess differences are collected 
for a period of time over a network of homogeneous and uncorrelated observing 
stations. From this data, which can be further stratified by pressure level and time 
of the year, the spatial statistics of the covariances can be computed, together with 
the perceived background and observation error variances; 
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2. The ‘"NMC method’’’ (Parrish and Berber, 1992). The main assumption of the 
method is that the statistical structure of the background correlations does not 
change significantly over the first 48h of model integration. Then it is possible to 
derive the background covariances from a statistical analysis of the differences of 
the 48h and 24h forecast fields verifying at the same time. The method is suitable 
for global numerical prediction systems and also very practical and 
straightforward to set up. On the other hand the main hypothesis it rests upon is 
rather difficult to justify; 

3. The ''Analysis-Ensemble method’’ (Fisher, 2001). This method is the one 
currently used at ECMWF. An ensemble of different analyses is realized by 
randomly perturbing the initial observations within the assumed observation error. 
These different analyses are then integrated in time till the next objective analysis, 
where the process is repeated. After a few days the differences in the first guess 
fields should be representative of the underlying background error statistics. 

The method chosen to specify the parameters involved in the modeling of the 
background and observation error matrices is the Observation method. The reason for 
this is that it is a direct and theoretically sound way of deriving the background spatial 
correlations and, as a bonus, it is also capable of providing an estimate of the background 
and observation variances, whose relative magnitude is perhaps the most fundamental 
quantity to be specified in every objective analysis algorithm. 

The main steps in the computations and the more relevant results will be 
described in the following paragraphs. For more details, see Vocino (2002). Also an 
account of the observations currently used and their assumed error statistics will be given. 


A. BACKGROUND ERROR STATISTICS 

In order to derive the necessary statistics on the background error, the 
"Observation method' has been used. The entire network of land radiosonde stations 
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present in the analysis domain (165) has been exploited in the eomputations: the data 
refer to the OOZ and 12Z synoptic times over a three month period starting from the 13*’’ 
of March 2002. The temperature 6-h observation increments (denoted as Ok-Bk) were 
collected for the standard pressure levels (1000 hPa, 925 hPa, 850 hPa, 700 hPa, 500 hPa, 
400 hPa, 300 hPa, 200 hPa, 100 hPa, 50 hPa). 

After removing the bias for each station, the following estimate of observation 
increments correlations was computed for each pair of stations k,l: 

Rm = <(Ok-Bk) (Oi-Bi)> /(<(Ok-Bk)^>‘^'< (Oi-Bif>^^^ ( 3.1) 


Due to the separability assumption, we can study the quasi-horizontal (isobaric) 
correlations independently from the vertical correlations. 

In order to study the isotropic component of the isobaric correlations Ru, they 
have been partitioned into intervals of 0.01 rad (~200 Km) of their mutual great circle 
distance. In each of these intervals the Fisher z-transform of the empirical correlation 
coefficients has been performed: 


Z = l/2log((l+R)/(l-R)) 


(3.2) 


This has been done in order to preserve the assumed normal distribution of the 
correlation coefficients around their mean value, and it has proved beneficial in the 
successive fit of the correlation models to the empirical data (Fig 3.1). 
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Figure 3.1 Example of correlation fit -with arithmetical and Fisher z-transform 
mean. 


The model functions fitted to the empirical correlation coefficients have been the 
Negative Squared Exponential (NSE) function: 

Pb= exp(-0.5(r/Lc) (3.3) 

and the Second Order Autoregressive (SOAR) function: 

Pb= (1 +r/Lc)exp-(r/Lc) (3.4) 

What -we are trying to determine is the correlation length Lc, 'which can be defined 
for any correlation model as (Daley, 1991): 

L,^(-2p/V^pf\=o (3.5) 


42 







where the Laplaeian operator due to the isotropy assumption, reduces to: 


S^=l/r d/dr(rd/dr) (3.6) 

Sample results of the fits are given below (distances are in radians units of Earth’s 

radius). 


T-T Correlation Function at 0300 hPa 



Figure 3.2 Correlation fit for isobaric (300 hPa) Temperature Observation 
increments. 
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T-T Correlation Function at 0500 hPa 
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Fig. 3.3: Correlation fit for isobaric (500 hPa) Temperature Observation increments. 


T-T Correlation Function at 0850 hPa 



Fig. 3.4: Correlation fit for isobaric (850 hPa) Temperature Observation increments. 
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The main quantitative results are summarized in the following two tables: 


Pressure Level (hPa) 

Lc (Km) 

Fit rmse 

1000 

713 

0.0618 

925 

339 

0.0321 

850 

426 

0.0362 

700 

378 

0.0244 

500 

409 

0.0304 

400 

368 

0.0250 

300 

517 

0.0441 

200 

539 

0.0214 

100 

893 

0.0488 

50 

809 

0.0421 


Table 3.1 Correlation lengths and rmse of fit for Temperature Observation 
inerements: NSE correlation model. 
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Pressure Level (hPa) 

Lc (Km) 

Fit rmse 

1000 

417 

0.0560 

925 

201 

0.0259 

850 

261 

0.0291 

700 

221 

0.0186 

500 

242 

0.0251 

400 

220 

0.0195 

300 

296 

0.0443 

200 

318 

0.0251 

100 

544 

0.0329 

50 

482 

0.0281 


Table 3.2 Correlation lengths and rmse of fit for Temperature Observation 
inerements: SOAR correlation model. 


From inspection it is clear that the SOAR function gives a better fit to the empirical 
correlations at almost all levels: this is mainly due to the better agreement of the SOAR 
model function with the data at intermediate distances (0.04-0.08 radians), where the 
NSE function is seen to fall off too steeply. The correlation lengths have been plotted for 
ease of reference in Fig. 3.5. 
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Vertical Profile of T-T Correlation Length Scale Parameter 



Figure 3.5 Vertical profiles of Correlation Lengths for Temperature Observation 
increments. 


Apart from the anomalously high value at 1000 hPa (whose cause is under 
investigation), the correlation lengths appear to have a plausible height profile. They are 
approximately constant over much of the troposphere and increase considerably in the 
stratosphere. The lower correlation length values shown by the SOAR model agree with 
the more gradual incline of the function at large distances, where the linear term becomes 
dominant. 

For the experimental runs of the objective analysis procedure, the SOAR model 
has been selected, with a constant tropospheric correlation length of Lc = 250 Km and a 
constant stratospheric (i.e./f/ev^250 hPa) correlation length of Lc = 400 Km. 

Another appealing feature of the "^Observation method" is the possibility of deriving 
objective estimates of the background and observation errors’ variances, which arguably 
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are the most important parameters to be speeified in an objeetive assimilation algorithm. 
How this is done follows from the definition of the eorrelation eoeffieient Rm (3.1): 
assuming homogeneity of the errors and uneorrelated observation errors (as is the case 
for independent radiosonde measurements), it can be shown (Daley, 1991) that: 


Rz =limr^Rki(r) = Eb^/(Eb^+Eo^) (3.7) 

where Eb^,Eo^ are the background field and observation variances respectively. This is 
what can intuitively be expected, since the observations are mutually uneorrelated: for 

distances r close, but not equal to 0, the correlation can only be explained by the 

background term contribution, so that extrapolating to zero distance gives the relative 
weight of the background term with respective to the total (background + observation) 
variance. 

On the other hand the total variance of the observation increments can be easily 
computed from: 

W = 1/K Ek<(Ok-Bk)^> (3.8) 

so that, using (3.7-3.8) each variance can be calculated. The results of these computations 
have been summarized in Tables 3.3 and 3.4. 
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Pressure Level 
(hPa) 

EU°C) 

Eo(°C) 

E,(°C) 

1000 

1.52 

2.19 

2.67 

925 

1.74 

1.19 

2.36 

850 

1.26 

1.31 

1.82 

700 

1.24 

1.10 

1.66 

500 

1.46 

1.58 

2.16 

400 

1.56 

1.64 

2.26 

300 

2.24 

1.37 

2.63 

200 

2.72 

0.94 

2.88 

100 

1.15 

1.16 

1.63 

50 

1.25 

1.32 

1.82 


Table 3.3 Background error, Observation error and Total perceived error for 
Temperature Observation increments: NSE correlation model. 


Pressure Level 
(hPa) 

EU°C) 

Eo(°C) 

Et(°C) 

1000 

1.62 

2.12 

2.67 

925 

1.86 

1.45 

2.36 

850 

1.33 

1.24 

1.82 

700 

1.32 

1.00 

1.66 

500 

1.56 

1.49 

2.16 

400 

1.66 

1.54 

2.26 

300 

2.39 

1.08 

2.63 

200 

2.87 

0.21 

2.88 

100 

1.20 

1.10 

1.63 

50 

1.32 

1.25 

1.82 


Table 3.4 Background error, Observation error and Total perceived error for 
Temperature Observation increments: SOAR correlation model. 
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pressure (hPa) 


Also these values have been plotted for ease of referenee in Fig.3.6 and 3.7. 


Vertical Profile of T Root Mean Square Errors 



Figure 3.6 Vertieal profile of Baekground error, Observation error and Total 
pereeived error for Temperature Observation increments: NSE 
correlation model. 
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Vertical Profile of T Root Mean Square Errors 



Figure 3.7 Vertical profile of Background error, Observation error and Total 
perceived error for Temperature Observation increments: SOAR 
correlation model. 

From the plots it is clear that, with the exception of the 200 hPa level, the 
background and observed root mean square errors are fairly constant with height and of 
comparable magnitude, in the range of 1.2-1.5°C: these values are compatible with the 
expected accuracy of radiosonde observations and 6-h forecast fields. 

The statistical calculation of the parameters in the vertical component has been 
performed in analogous fashion. Since, from inspection of 2.67-70, 2.78-80 and 2.85-86, 
it is clear that all vertical correlations can be expressed in term of X^(Pi’Pj) = /“"(PiPj), 
the observed vertical correlations of the geopotential background increments have been 
fitted to the model function (1 +kplog^(pi/pj)X^ (Fig.3.8). 
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Fit of Height Background Error Vertbai Correiation 



Fig. 3.8 ; Fit of vertical profile of geopotential height background minus 
observation increments: equation (2.68) correlation model. 


The fit is satisfactory (RMSE = 0.101), giving for the fitting parameter kp= 1.77, 
which is the value used in the experimental trials of the objective analysis scheme. 


B. OBSERVATION ERROR STATISTICS 

Since the new objective analysis scheme is still in an experimental stage and it 
was felt it was important to validate the merits of the new algorithm and the revised 
background statistics and correlation functions, the observations used are the same 
ones used in the current operational Optimum Interpolation analysis. A brief account 
of their error statistics will be given below. 
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1. Radiosondes and Pibals 

Even today, radiosonde reports (TEMP) are the main eomponent of the global 
weather observing system in the Northern Hemisphere. On the main synoptic hours (00 
and 12 EITC) about 165 reports are routinely analyzed (around 80 reports at 06 and 18 
UTC synoptic hours). Mandatory and significant level observations of temperature, 
relative humidity and wind are assimilated. Observations at levels not coincident with 
isobaric analysis levels are vertically (linearly in log(p)) interpolated. Erom the results 
shown in the previous section, the assumed accuracy (RMSE) of the temperature 
observations have been set to 1.2°C; no radiative corrections are employed. Eor relative 
humidity the assumed accuracy has been set to 5%. Perceived winds accuracy varies from 
2.1ms'^ in the lower levels to 3.5 ms'^ in the stratosphere. 

A major advantage of the new objective analysis is that it employs temperatures 
instead of geopotential heights, as the previous one did. This brings about two positive 
features: the temperature observation errors are much less vertically correlated than the 
geopotential errors and the background error correlations for temperature are sharper than 
the corresponding geopotential correlations. The end result of these two factors is that 
higher vertical resolution analysis is possible. Also, on a more technical level, making the 
assumption of vertically uncorrelated temperature observation errors drastically improves 
the condition number of the analysis equations, thus greatly reducing the computation 
time required for the minimization of the cost function. Pilot weather balloons (Pibals) 
wind reports are also routinely analyzed, with expected observation errors somewhat 
greater than those assigned to radiosonde reports. 

2. Aircraft Based Observations 

Aircraft based temperature and wind observations are routinely assimilated, after 
they have undergone the preprocessing step of the analysis cycle, where consistency 
checks on the moving platform reported position and climatological gross limit checks 
are performed. 

Automatic observations (AMDAR, Aircraft Meteorological Data Relay) are given 
greater weight (i.e., smaller observation RMSEs) over manual observations (AIREP, 
Aircraft Reports). Both are treated as uncorrelated, single level reports. 
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3. Atmospheric Motion Winds 

Atmospheric motion winds are derived from tracking the movement of identified 
cloud structures from a sequence (i.e. three) of geostationary satellite images. These 
observations are not routinely analyzed in the current operational objective analysis, but 
their use is envisaged in the new data assimilation system. It is felt, however, that, due to 
the current uncertainties in the height assignment of the winds, research is still needed in 
order to design an observation operator H able to extract all the relevant information. 

4. Conventional Surface Observations 

Conventional surface observations, both over land (SYNOP) and over the sea 
(SHIP, BUOYS) are routinely analyzed. After preprocessing and superobbing the reports 
closer than the average grid spacing (~55 Km), around 1200 observations are fed into 
objective analysis algorithm. For the upper air analysis surface pressure innovations are 
transformed into geopotential innovations of the closest analysis level and their impact on 
the analyzed variables {T,u,v) is evaluated through the use of the correlations 2.78-2.90. 

Surface fields (Surface pressure. Mean Sea Level pressure, 10-m wind) are 
analyzed through a two-dimensional, multivariate version of the algorithm (see Section 
II.C.2). Winds are only assimilated over sea. At the moment geostrophy is not strictly 
enforced, using a geostrophic coupling parameter ju=0.7. However it is felt that a more 
accurate estimate of the geostrophic, balanced component of the observed winds is 
needed in order to make good use of the wind observations. Ways to accomplish this by 
taking into account the structure of the forecast marine boundary layer are currently being 
investigated. 

The Mean Sea Level pressure (MSLP) is well defined over the sea but, over land, 
is more of a visual aid for forecasters than a real meteorological field. However it is felt 
to be important that the analyzed MSLP field closely draws to the accepted MSLP and 
surface wind observations. To accomplish this, it has been found necessary to objectively 
analyze the MSLP and surface wind departures. Reduction of surface pressure (SP) 
analysis to mean sea level did not give satisfactory results, mainly because of the 
differences between model temperatures and observing station temperatures. 
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5. Scatterometer Winds 

Scatterometers are satellite-borne radars, whieh provide measurement of surface 
wind speed and direction over sea. There is ambiguity, however, in the wind directions, 
and four different wind directions are compatible with each measurement. This ambiguity 
can be resolved either by choosing the direction closer to the first guess or by more 
advanced methods that take into account the spatial coherence properties of the retrieved 
wind field. 

At the moment work is under way to assimilate the wind field product generated 
by the Dutch Meteorological Institute (KNMI) from the QuikSCAT satellite observations 
(SeaWinds). This is “...a near real-time 100-km resolution QuikSCAT product, which 
includes inversion. Quality Control, and a 2D-Var ambiguity removal algorithm...” 
making it suitable for data assimilation purposes (for more information, see 
www.knmi.nl/onderzk/applied/scattmtr/quikscat/index.html ). 
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IV. VALIDATION AND PRELIMINARY RESULTS 


There are many possible ways through whieh the quality of the objeetively 
analyzed fields can be gauged; 

1. A subjective, “synoptic” evaluation of the charts produced by the data 
assimilation cycle from a forecaster’s perspective, assessing the adherence to the 
observations (especially those not used in the analysis scheme, such as satellite 
imagery) and the degree of enforcement of “physical” balance properties; 

2. Use of internal diagnostics, such as the analysis error covariance matrix, which 
can be computed as 

Pa = (Pb‘+H^R‘H)‘ (4.1) 

It is common to compute only the diagonal elements of the matrix, which give an 
estimate of the variances of the analyzed variables. Unfortunately eq. (4.1) strictly 
holds on condition that the background and observation error covariance matrices 
{Pb,R) have been correctly specified, which is not usually the case. If this is not 
true, then eq.(4.I) underestimates the real analysis error covariances 

3. A statistical, “objective” verification through comparison of forecasts produced 
from the analyzed fields with other forecasts started from independent data 
assimilation cycles. 

This later approach has been taken in this study. Details of its implementation will 
be given below, together with a discussion of the results so far obtained. 
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A. METHOD OF VERIFICATION 

In order to assess the quality of the analysis fields in an objective manner, two 
parallel runs of the HRM model (see Section I.A.3) have been set up, running at OOUTC 
every day up to T+48h: one starting from the analysis fields of the new data assimilation 
cycle, the other from the analyzed ECMWF fields interpolated to the HRM model grid. 
Apart from the initial conditions, all the other features of the two model integrations are 
equal (boundary conditions, resolution, etc.). This should guarantee that any difference in 
the subsequent forecast fields should be traced back to differences in the initial 
conditions. The accuracy of the forecast fields is estimated through the use of two 
common scalar measures of gridded fields: the root mean square error (RMSE) and the 
anomaly correlation (AC). 

Denoting by/y the grid values of the forecast field and ay the grid values of the 
verifying analysis, then the RMSE is given by (Wilkes, 1995): 

RMSE = - ay)f^ (4.1) 

This skill score is what we might define as an absolute measure of the forecast’s 
quality. However it is not able to distinguish if the errors are related to biases in the 
forecast fields or to the misplacement of significant weather patterns. In order to do this 
the anomaly correlation score is useful (Wilkes, 1995). If we define Cy the climatological 
averages of the analyzed fields at each grid point, then: 

AC = (E=I.M^j=J.N(fii- Cy) (ay- Cy))/(( E i=j,MEj=J,N(fy - Ct/ (E=j,MEj=J,N(ay- Ci/)) 

(4.2) 

From this expression it is clear that we are investigating the correlations between the 
anomalies with respect to climatology of the forecast and the analyzed fields. In this way 
we are highlighting the pattern similarities between the two fields, while giving less 
weight to their absolute values. 

A final point to be made is that the verifying analyses are the ECMWF analyses 
interpolated on to the HRM grid. This choice has been made in order to test the quality of 
the forecast fields derived from the data assimilation cycle under their most unfavorable 
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conditions. The statistical errors so derived should then represent an upper limit to the 
expected forecast errors. 


B. PRELIMINARY RESULTS 

Some results from a statistical comparison of the model runs over 30 cases (15* 
June 2002 - 15* July 2002) are shown in Fig.4.1 through 4.8 for the mean sea level 
pressure, geopotential height and wind speed fields. From inspection of the charts, it is 
evident that: 


1. Degradation in quality with respect to the ECMWF based model run is within 
acceptable limits (~ 0.7 hPa for MSLP, from 5 to 10 gpm for geopotential height, 
from 0.9 m/s at 850 hPa to 2.3 m/s at the jet level height for wind speed); 

2. The anomaly correlation scores for both the MSLP and 500 hPa geopotential 
height field are very close: this is an indication that the location of weather system 
is correctly placed in the CNMCA analysis’ derived forecast fields and that the 
main source of error is to be found in the diagnosed intensities; 

3. The error evolution is smooth in time, indicating that the analysis fields are 
perturbing the first guess fields in a “balanced” and physically reasonable manner. 
The gradual decrease in the RMSE differences between the two model runs is to 
due to the steady increase of influence of the common ECMWE derived boundary 
conditions. 


It should also be borne in mind the different spatial and vertical resolution of the 
ECMWE analyses (which are the results of 4D-Var minimizations at half the operational 
model resolution) with respect to the CNMCA analyses. The RMSE scores thus 
incorporate an intrinsic error of representativeness whose magnitude has not yet been 
determined. 
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MSLP RMSE 


CNMCA_ANA — BCMWF_ANA 



Figure 4.1 RMSE of Mean Sea level pressure forecast fields vs. ECMWF 
analysis. 


Z850 hPa RMSE 


CNMCA_ANA — ECMWF_ANA 



Figure 4.2 RMSE of 850 hPa geopotential height forecast fields vs. ECMWF 
analysis. 
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Z500 hPa RMSE 

— CNMCA_ANA — ECMWF_ANA 



Figure 4.3 RMSE of 500 hPa geopotential height forecast fields vs. ECMWF 
analysis. 


Z300 hPa RMSE 

— CNMCA_ANA — ECMWF_ANA 



Forecast time 


Figure 4.4 RMSE of 300 hPa geopotential height forecast fields vs. ECMWF 
analysis. 
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U850 hPa RMSE 


CNMCA_ANA — ECMWF_ANA 



Forecast time 


Figure 4.5 RMSE of 850 hPa wind speed forecast fields vs. ECMWF analysis. 


U500 hPa RMSE 


CNMCA_ANA — ECMWF_ANA 



Figure 4.6 RMSE of 500 hPa wind speed forecast fields vs. ECMWF analysis. 
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U300 hPa RMSE 


— CNMCA_ANA — ECMWF_ANA 



Figure 4.7 RMSE of 300 hPa wind speed forecast fields vs. ECMWF analysis. 


MSLP Anom. Corr. 


CNMCA_ANA — ECMWF_ANA 



Figure 4.8 Anomaly Correlation of MSEP forecast fields vs. ECMWE analysis. 
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Z 500 hPa Anom. Corr. 


CNMCA_ANA — ECMWF_ANA 



Figure 4.9 Anomaly Correlation of 500 hPa Geopotential height forecast fields vs. 
ECMWF analysis. 
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V. CONCLUSIONS AND SUGGESTIONS FOR FURTHER 

DEVELOPMENT 


In this work the main ideas behind the new data assimilation eyele being 
implemented at Italian Air Foree Weather Serviee have been presented. The system is in 
a relatively early stage of development and many important eomponents are still missing 
in order to make it truly effeetive and operationally viable. However it is felt that the 
basie building bloeks of the system have been set up: 

1. The pre-proeessing and quality eontrol of the observations; 

2. The new 3-D eovarianee model in spherieal eoordinates; 

3. The deseent algorithm for the minimization of the eost funetion; 

4. The statistieal evaluation of the baekground error eovarianees. 

At this point, the development effort will eoneentrate mainly on the following 
areas, whieh are deemed to be the most urgent requirements for an operational use of the 
new system: 

1. The implementation of an effeetive buddy eheek algorithm for the sereening of 
marginal observations. Given the global nature of the analysis seheme, a loeal 
method sueh as that first proposed by Lorene (1981) would not be appropriate. 
Approximations to modern methods following Daley and Barker (2000, Seetion 
9.3) are being investigated. 

2. The implementation of an effeetive and easy to eompute preeonditioner for the 
minimization of the eost funetion is also being pursued. Currently the time 
required for the minimization algorithm is within aeeeptable limits, but it is 
expeeted that the inelusion of a larger number of satellite observations will ehange 
this. 
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3. The extension of the algorithm to observations that are not linearly related to the 
state variables. This work is well under way for seatterometer winds and will be 
started for column precipitable water observations and satellite radiances. The 
direct assimilation of satellite radiances is a long term goal whose main value is 
not expected to be so much in improving the current analyses over the present 
analysis domain as in the experience to be gained in view of the future availability 
of much improved observations from satellite hyperspectral sounders. 

The results shown in the previous section indicate that the skill of the forecast 
fields derived from the new assimilation cycle still lags the skill of ECMWF derived 
forecasts by 18-24 h when verified with respect to ECMWF analyses. As mentioned in an 
earlier section, part of this is due to representativeness error. It is felt that this error can be 
reduced within 6-12h when more observations will have been added and some parameter 
optimizations will have been performed. 
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